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APPROXIMATION OF THE NEWTON STEP BY A DEFECT CORRECTION PROCESS 


E. ARIAN*, A. B ATTERM ANN t, AND E.W. SACHS* 

Abstract. In this paper, an optimal control problem governed by a partial differential equation is considered. The 
Newton step for this system can be computed by solving a coupled system of equations. To do this efficiently with an 
iterative defect correction process, a modifying operator is introduced into the system. This operator is motivated by 
local mode analysis. The operator can be used also for preconditioning in GMRES. We give a detailed convergence 
analysis for the defect correction process and show the derivation of the modifying operator. Numerical tests are 
done on the small disturbance shape optimization problem in two dimensions for the defect correction process and for 
GMRES. 

Key words, optimal control governed by PDEs, iterative methods, defect correction, GMRES, preconditioning, 
Newton step, SQP. 

Subject classification. Applied and Numerical Mathematics 

1. Introduction. Many optimization problems can be formulated as equality constrained problems with a special 
structure. If one considers optimal control or optimal design problems, the variables are partitioned into the state and 
control or design variables which we denote by <f> and u, respectively. This leads to the following problem formulation 

min u) s.t. h(<j>, u) = 0. 

(<M) 

If one is interested in algorithms with a fast rate of convergence, one would tend to use Newton’s method for these 
problems. Note that this method can be applied in two different ways. Under appropriate assumptions, see Section 2. 1 , 
one can solve for each control variable u the system equation h(cf)(u ), u) = 0 to obtain a state <f>(u) which depends on 
u. This is typical, when h represents a boundary value problem where the control variable is on the right hand side of 
the differential equation. Then one can apply Newton’s method to the unconstrained minimization problem 

min J{u) = T{(f>{ u ),u) , 

u 

where the step s is computed by solving the linear system 

J'{u)s = - J(u) 


for s. 
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Alternatively, one can keep all the variables u ) and consider the necessary optimality conditions for the con- 
strained optimization problem. Then one obtains the nonlinear equation in (<j>, u, A), 

i h{<j>,u) \ / 0 \ 

G(<t>,u,\)= I = [ 0 I 

^ Fu{4>,u) + h*(<f>,u)\ j \ 0 / 

This equation can be solved by Newton’s method. For the step one has to solve the linear system 

u, A)(v, s, w) T = u , A) . 

This approach is the same if one applies an sequential quadratic programming method (SQP) to the constrained prob- 
lem using the Newton multiplier update for the Lagrange multiplier, see [28]. 

It is important to note, that in both approaches at each step of Newton’s method a linear system of equations has 
to be solved which exhibits the same structure for both cases, see Section 2.3. Only the right hand sides differ in these 
cases. If we denote the variables for the linear system by (u, s, w) then one obtains with the Lagrangian 

£(<M,A) = + M<M), 

the linear system 

( \ f v \ / 0 \ / \ 

fc'xufr £uu hu s I = I ~£u I or — J C u I . 

V h 4> hu o ) \ w ) \ 0 / \ h ) 

Since the solver of these linear systems often requires the largest part of the CPU time of an algorithm, it is the 
goal to utilize the special structure of the linear system in the linear system solver. This has been considered by [5] 
where several preconditioners were used and compared numerically and theoretically. Further discussions can be 
found in [10], [22], [26], [27], [18], [17], and [9]. In [16] the authors use a multilevel technique on the necessary 
optimality conditions in connection with Newton’s method under box constraints on the control. The structure of the 
Newton system for optimal control problems is exploited in [15] to design special quasi Newton updates for problems 
including differential equations. It is well known that one does not need to solve the Newton equations exactly at each 
step [8] but only needs to decrease the accuracy in the residual as one approaches a stationary point of the optimization 
problem. This is another reason why we investigate iterative solvers for the Newton equation. 

The gradient of J can be computed sequentially, see Section 2.9, by solving an adjoint equation after solving the 
nonlinear state equation. The system for the Newton step cannot be solved sequentially, since its variables (v, s, w) 
are coupled through the equations. This makes a Newton step quite expensive because an iterative procedure has to be 
applied. In some applications the variables <p and u are separated in J- and h in such a way that the mixed terms in the 
second derivative of the Lagrangian disappear, i.e., 

— C u<f> — 0. 

If one would omit the upper left term then the linear system matrix has the form 

/ 0 0 h*\ 

0 £ uu hZ . 

\ 0 J 
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The resulting approximate Newton equation has the advantage that it can be solved sequentially, see Lemma 3.5. 
Hence it can be applied in an iterative way to improve the accuracy of the solution of the Newton step. If one analyzes 
its convergence, one obtains, see Section 3.3.4, that convergence is obtained if 

P (c^n - i) < i , 

where H denotes the Hessian of J or the reduced Hessian of the constrained optimization problem in terms of T and 
h. Thus, 7i is given by 

(1.1) w = h* h^ x C^h^ 1 h u - h* h^* - C u<t> h ^ h u + C uu . 

Since the condition p{C^H — I) < 1 might be too restrictive we investigate the following strategy in this paper. 

At first we replace C uu by a term C Uyt = C UU {I + eV ), where V and e can be chosen properly. This choice 
in general depends on the application under consideration. Then the system equation changes and we consider a 
separation into an outer loop iteration and an inner loop iteration. The outer loop iteration is given by 

( C<p u \ / v \ / 0 

(1.2) C U(P £ u , c h£ s =1 -£ u + C uu cVs n 

\ h<f> h u 0 / \ w ) \ 0 

which is solved by an inner loop iteration through 

/ o 0 h* \ / £" c* u o \ 

(1.3) 0 C u , ( K x fc+1 =r- C u( j, 0 0 x fc . 

\ h u 0 / V° 0 0/ 

Here, r denotes the right hand side of the original linear system equation. Thus the advantage of the sequential 
solution of the approximate Newton step is retained. In Section 3.3 we analyze the convergence properties of this 
iterative solver for the Newton step. 

The choice of the operator V in the iteration (1.2), (1.3) is crucial for the convergence properties of the resulting 
scheme. We suggest to make this choice by analyzing the optimization problem on the infinite dimensional level, i.e., 
before discretization is applied. The operator V can be derived approximately with local mode (Fourier) analysis of 
the reduced Hessian, following [2], [3]. By that we are using the structure induced by the governing partial differential 
equation (PDE) to accelerate the convergence process. Once V has been determined, it can be applied in various 
manners to accelerate the convergence process, for example by taking the iteration matrix in (1.3) as a preconditioner 
for Krylov subspace methods. However, we concentrate on the defect correction process (1.2), (1.3) because it is 
simple to apply. Once a code is given for the solution of the state and costate equations, the implementation of the 
defect correction process is reduced to successively solving the linearized state and the costate equations, with possibly 
different right hand sides. The defect correction process is formally introduced in Section 3. 

In Section 4.1 we apply this approach to an optimal shape design problem arising in aerodynamics. It is a boundary 
control problem in which the shape of the solid wall is optimized by modifying the right hand side of the Neumann 
boundary condition. It involves the solution of an elliptic boundary value problem in two space dimensions. By using 
a local mode analysis of the reduced Hessian (1.1) which is done in Section 4.7, we obtain an operator V and an 
indication of the convergence properties of the method for a small step size of the discretization. This is verified by the 
numerical results and the rates obtained for the example. In Section 4. 1 0 we present numerical results for the defect 
correction process and for preconditioned GMRES using different choices of the operator V. We obtain a significant 
decrease of the residual in the first iterations. This convergence property is crucial in many applications that involve 
computationally intensive cost functional evaluations and derivative computations. 
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2. General Approach. In this section, a general equality constrained optimal control problem is addressed. The 
necessary optimality conditions are given for this problem together with the optimality conditions for an equivalent 
unconstrained problem. 

2.1. Problem Formulation. We repeat the problem to be considered in its general formulation 
(2-0 min u) s.t. /i(0, u) = 0 . 

(<£,u) 

The constraint h (</> , u) = 0 denotes the state equation where <j> is the state variable and u is the control, or design, 
variable. Under the following assumption, the equation can be solved uniquely in 0 for a given u. Also, the Newton 
step for the minimization problem (2.1) is well defined. In our presentation we follow [19]. 

Assumption 2. 1 . Let X,y,Z be Hilbert spaces . Let T\ * x y -> 1R and h: X x y -+ Z be twice continuously 
Frtchet differentiable. Let h<p, the partial Frichet derivative of h with respect to 0, be bijective and continuous. Let 
h u , the partial Frichet derivative ofh with respect to u, be continuous. 

Remark 2.2. By Assumption 2.1, the inverse o//i^(0 c , u c ) at the point (0 C , u c ) exists. The derivative /^(0 C , u c )(5<f>) 
of h with respect to 0 at the point (0 C , u c ) is linear in the increment 6<f>. Thus , the inverse ofh* (<p c ,u c ) also exists 
and is continuous (see, e.g., [7]). Here and in the following, the superscript * denotes the adjoint operator or space. 

We will in the following often denote /i«*>(0 c , u c ) by and apply similar conventions to other functions. Note also 
that the relationship (fi* ) * = h# holds for h# and the other considered functions (see, e.g., [7]). 

The implicit function theorem (see, e.g., [30, p. 150]) allows to define the following mapping 0. 

Lemma 2.3. Let Assumption 2.1 hold and let <j> c ,u c satisfy h(<j) c ,u c ) = 0. LetlA be an open neighborhood of 
(0c, u c ) € X x y. Then there exists a unique mapping <j>: U — > X that is twice continuously Frechet differentiable in 
a neighborhood U ofu c £ y and satisfies /t(0(u), u) = 0 Vu £ U. Furthermore, the derivative 0' of 0 with respect 
to u is given by 

< 2 * 2 ) 0 f ( w ) = ~h~ l (4)(u),u) h u (<(>(u),u) Vu £ IA. 


In the situation of Lemma 2.3 we can define the following unconstrained optimization problem which is equivalent 
to the problem in its original formulation (2.1); 

( 2 * 3 ) min J(u) = 

2.2. The Necessary Optimality Conditions. The Lagrangian function for problem (2.1) is given by 

( 2 - 4 ) £( 0 , ti, A) = : F(0, u) + A x /i(0, u), 

where A denotes the Lagrange multiplier defined in Z * , the dual of Z. The first order necessary optimality conditions 
for a minimizer of problem (2.1) are given (see, e.g., [20]) by setting the gradient V£ of the Lagrangian function £ to 
zero, i.e., by the equations 


(2.5) 

state: 

£\= h(4>,u) = 0 , 

(2.6) 

costate; 

J 7 $ + h ^ A — 0, 

(2.7) 

design: 

£u — T u -f h * A = 0. 


The gradient, g = J f (u), is given by the following lemma (this is the necessary optimality condition of first order 
for the unconstrained problem (2.3) (see, e.g., [20])). 
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LEMMA 2.4. Let Assumption 2.1 hold. Define foru G y 

i. ) a function that satisfies (2.5) and 

ii. ) a function X (u) as the unique solution of the adjoint equation (2.6), i.e., of 

(2.8) h* ( 4>(u ), u) X(u) = it). 

Then the gradient g of (2.3) is given by 

(2.9) g = J'(u) = T u {<t>(u), u) + h* u ) A (u). 


Proof The assertion follows from the chain rule and (2.2): 

g = J'(u) = -F u (0(u),w) 4- <t>'(u) x -7>(</>(tt),u) 

= *.(#«), u) + /i* (</>(«), u) (~h^ x (<j>(u), it)) ^(^(«), u). 

□ 

2.3. The Newton Step. The definition of the Newton step for the unconstrained problem (2.3) requires the 
computation of the second derivative of the objective functional J . This, in turn, requires the differentiation of 
the adjoint variable A. 

Lemma 2.5. Let Assumption 2.1 hold . Then X(u) defined in the adjoint equation (2.8) is differentiable, and the 
derivative is given by 

(2.10) A \u) = -h~ x {(j)(u),u) [C^uMu), u, X(u)) + </>'(u) u, X(u ))} . 


Proof. Define a map k by «(A,ix) = £^(0(u), u, A), where by (2.8) A = A(u) solves «(A(it),u) = 0. Since 
ka(A, u)(.) = h £ (0(u), u)(.) is invertible by Assumption 2.1 (see Remark 2.2), the assertion follows by the implicit 
function theorem. □ 

With this result we can now express the Hessian of the unconstrained problem (2.3). 

Theorem 2.6. Let Assumption 2.1 hold. Then J defined in (2.3) is twice Fr^chet differentiable, and the Hessian 
is given by 

(2.1 1) TL = ff (u) = h u h ^ h ^ h u h u h ^ £u<£ h ^ h u 4- L uu . 

Proof Apply the chain rule to equation (2.9) and use equations (2.2) and (2.10). □ 

With Hessian, TL , and gradient, g , for problem (2.3) the Newton step, s, is given by 

(2.12) Hs=-g. 

REMARK 2.7. So far we assumed that the state and costate equations are feasible at every SQP step (reduced 
SQP). Thus the Newton equation is given by (2.12). In case of a full SQP algorithm , where feasibility is not required 
at each step, (2.12) has to be modified to 

(2.13) ns = -C u -C u ^h^ 1 h + h x h^ x C^h^ 1 h-h x h^ x C 4l . 

The right hand side of (2.13) consists of the residual of the design equation (this term is equal to the reduced gradient , 
g, when the state and costate equations are solved), two terms that vanish when feasibility is achieved, i.e., when 
h(<f, u) = 0, and the last term that vanishes when the costate equation is feasible, i.e., when £^>(0, u, A) = 0. 
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Theorem 2.8. Let Assumption 2.1 hold. The Newton step, s, defined by (2.12) can be computed by solving the 
self-adjoint system of equations 


(2.14) 



Remark 2.9. In case of a full SQP on a nonlinear problem, the right hand side of (2. 14) should be modified 
— (£0, C u . C\ ) ! to be consistent with (2.13). 

Proof. For s £ y define v G X and w € Z* by 

(2.15) v = ~ h ^ h uS , 

w = (£#« + £^s). 

Then by (2. 1 1 ) and (2. 1 2) the Newton step, s, satisfies the equation 

9 ~ h u £^0 h'p h u s — h* h ^ x C$ u s — £ U 0 1 h u s + £ uu s 

— £ U 0 v 4- £ uu s + h x (— h^ x £00 v — h^ x C$ u s). 


to 


We have shown that solving (2.12) for the Newton step, s, is equivalent to solving (2.14). Writing the right hand 
side of (2.14), (0, -g, 0) T , as (r 1 ,r 2 ,r 3 ) T = r, we denote (2.14) by 

( 216 > JCx = r , 

and the exact solution of (2.16) by x *. Thus, AC is defined as 


(2.17) 


K.= 



Notation 2.10. In the following, the vector of unknowns ( v,s,w) T will often be referred to as x. For the 
description of iterative processes, the superscripts c -+ will denote current and new iterates, respectively, e.g., x c the 
current x-iterate. The solution of an iterative process will be indicated with the superscript *, e.g., x*. In addition, the 
error in the vector x is denoted by e; specifically, e c is the error in the current x-iterate. The error in the components, 
e.g. s, will be denoted by e s . Thus, for instance, 


x — x = 


v c \ 
s c 

w c ) 




= e . 


3. The Solution Method. In order to solve for the Newton step, a defect correction process is employed. The 
defect correction process (see [12], [24]) is derived in this section. Convergence of the process is governed by the 
choice of approximating operator fit. A detailed discussion and convergence analysis is done in this section. 

3.1. The Defect Correction Process. Solving for the Newton step, i.e., solving (2.1 2), is equivalent to solving 

fCx = r in ( 216 >- The idea of defect correction approach is to replace AC by a simple approximation AC. The 
solution of the approximate problem 

0*0 fcx = f 
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is then reached iteratively. It is essential that K 1 be relatively simple, i.e., that it is much easier to find a solution 
to (3.1) than to (2.16). 

We now introduce the defect correction process. 

ASSUMPTION 3.1 . Guided by the treatment in [12], we assume the following. 

i. ) Let JCiZDV—tVcZ continuous and bijective, 8 , 8 Hilbert spaces , T>, T> closed subsets. 

ii. ) The defect 

(3.2) d(x) = fCx — f 

can be evaluated for approximate solutions x £ P to all neighboring problems. The neighboring problem is 
to find x £ P with Kx — r for given r € P. 

iii. ) The approximate problem (3.1) can be solved uniquely for r G P, i.e., we assume the existence of an approx- 

imative inverse K,~ 1 ofK, such that KL~ l 1C x « x for x 6 P and 1C K,~ 1 r « f for f £ P. 

Assuming that we know an approximation x c € V for x* and that we have computed its defect d(x c ) = 1C x c —r = 
— /Cz*, this information can be used for the computation of an update x+ by means of solving a problem (3.1). 
The error e c = x c — x* satisfies e c = 1C~ l (r 4- d(x c )) — K~ l r = /C -1 d(x c ). Instead of performing the difficult 
solve with /C, we use the approximation KL to compute e c = d(x c ) and use this quantity as a correction for x c . 
The iterative usage leads to the scheme 

(3.3) = z c -f e c = (I - /C -1 K)x c + x° 

with j: 0 = /C _1 r as initial iterate. We define 1Z by the relation 7 Z := K, — fc and call K + a splitting of 7C. With 
this notation we write (3.3) as 

(3.4) /Cx + = r-7lx c 

to indicate that we do not really apply /C -1 but solve the system with /C. 

3.2. The Modified System Defect Correction. We will in the following apply the defect correction process not 
precisely in the way it was derived in the above Section 3.1, but introduce two changes. First, the defect correction 
process described in Section 3.1 can be nested, i.e., an inner defect correction loop can be used to find the solution to 
the system (3.4) that has to be solved in each step of an outer loop. This is the point of view we take in the following 
presentation. Secondly, we modify the system (2.14) that we are interested in solving with the help of an additional 
operator V. We call this approach the modified system defect correction process. 

We now turn to the splittings we propose for the solution of (2.14) by the process (3.4). The choice of /C in the 
splitting 1C = K, + 7Z is crucial for both applicability and convergence of the iterative scheme (3.4). Our choice of KL 
is motivated by the structure of the underlying system (2.14). We now supplement Assumption 2.1. 

Assumption 3.2. Let C uu / 0,/!^ exist. 

We will in the following see that the part C uu in the system 1C is crucial for the performance of the solution 
method. The convergence requirement is 

- 1) < i. 

In general, C uu will not model the Hessian, Ti , very well, and convergence is not necessarily ascertained. To ensure 
convergence, and to allow for convergence acceleration, we introduce an operator V, which will be used to modify the 
system. The choice of V will be discussed subsequent to the convergence analysis in Section 3.3. 

ASSUMPTION 3.3. Let an operator V on y and a real scalar e exist such that I + cP is invertible. 
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Here, I is the identity on >>. Note that Assumption 3.3 is necessarily satisfied for small e. The operator V will in 
general be applied to modify C uu . As an abbreviation we use 

£.,«=£«. (I + e-P). 

Thus, £ uu is naturally regained as C U}0 . 

3.2.1. Outer Loop. To solve the system fCx - r in (2.16) efficiently, we modify fC, K. given in (2.17), through 
replacing C uu by C u<( = C uu (I + eV) to 



With 


0 0 0 \ 

( 3 - 6 ) T^o — 0 -C uu eV 0 , 

0 0 0 / 

the equality K-o 4- 7 Zo = fC holds so that ICo, 7 Zo define a splitting of fC. 

For the solution of fCx = r we start a defect correction process given by 

( 3 - 7 ) it 0 x n 0 +1 = r-H 0 x n 0 , 

which is equivalent to 

/ C* u h$ \ / v \ U+1 / 0 \ 

^3- 8 ) C uij) C u , t h£ s I =| -g 0 + C uu ePs n I . 

\ h<p h u 0 / \ w J \ 0 J 

We start at n - 0 with a starting point i>°,s 0 ,w°. In each step of this defect correction process, a linear system has to 
be solved. One possibility is to do this via an inner defect correction loop. 

3.2.2. Inner Loop. For the inner loop we define a splitting of the system matrix K 0 in (3.8). The splitting 
TCo = ICj + 7 Zj we propose is given by 


(3.9) £j = 

The corresponding inner loop is 


0 0 h* 

0 £„,e K 

h"u 0 



(3.10) iCjx^ ^ro-Htx), 

where ro is the right hand side of the outer loop, i.e., i'o is given by r q = r — R-o Xq. This amounts in each step to 

/ 0 Q h* \ / X ! \ / L^ u x\ 

® h* ( x 2 j = j ~go + C uu ePs n — C U( f,xX 

\ h# h u 0 / \ x 3 / \ 0 

Starting at k = 0, use x J — Xq. Set the solution x*j of the inner loop as the new outer loop iterate xj +1 . 
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Remark 3.4. For the splitting defined by (3.9), the iterative scheme (3.4) can be viewed as applying (forward) 
Gauss-Seidel on the system 

( b^ C, £<p(j) ^ 

(3.12) K £ u , e C u4> . 

\ 0 b u h(f) J 

Because (3.12) is, even for e = 0, a non-self adjoint permutation of system (2.17), symmetric Gauss-Seidel (i.e., 
forward Gauss-Seidel followed by backward Gauss-Seidel, see [11], [14]), does not lead to a selfadjoint operator 
M =1 -iC~ l K,in (3.3). 

3*2.3. One Inner Iteration. If only one inner iteration is performed, inner and outer loop can be combined in 
one closed formula 

(3.13) K.x k+1 =r - TZx k . 

This is described by (3.1 1) with s n replaced by x k . The corresponding splitting is AC = AC + TZ with K, = AC/, 
1Z = IZj + Ho, and given by 


( ° 

0 

K \ 



£<pu 


° 

£u,c 

K 

, n = 

Gu4> 

- Cuu ep 

n 

0 

n / 


h u 0 


3.2.4. Applicability. With the splittings defined in the preceding Sections 3.2.2 and 3.2.3, the solution of the 
scheme (3.4), i.e., to K. x* = r — 7Zx c , can be computed at the cost of solving three linear subsystems. 

LEMMA 3.5. Let the Assumptions 2.1, 3.2 and 3.3 hold. The solution to the systems (3.10) and (3.13) can be 

accomplished at the cost of solving three linear subsystems. 

Proof Since the systems are blocktriangular, back-substitution furnishes the solution. 

1. ) The solution x k+l to (3.13), i.e., to fcx k + l = r — 1Zx k , can be computed by successively solving the systems 

b^ Xq —~ T\ Xj T<f>u 

C Ut e x £ +1 = r 2 — C u +x\ + C uu tPx\ - h* x k+1 , 

b<px k + l = r 3 -b u x 2 +1 . 

2. ) The solution x k+1 to (3.10), i.e., to Kj x k+l = ro — Hi x k , can be computed by successively solving the 
systems 

bfi *^/,3 = r 0f i ^/ t i ’ 

2^/ 2 = T'0 , 2 £u<t> \ b u Xj r g , 

h 4> = r O, 3 - b u X k j+2 . 

□ We now turn to the convergence analysis of the iteration (3.4) for the proposed splittings. 

3.3. Convergence Analysis. The convergence of the defect correction process depends on properties of the 
iteration operator M — I ~ = —KL~ l H in the process (3.3). First, we state the basic requirement on M in 

Theorem 3.6. We then address the necessary and sufficient conditions for convergence of the modified system defect 
correction for the processes described in Sections 3.2.1, 3.2.2, and 3.2.3. For this, we investigate the convergence- 
governing matrices Mo, Mi, and M of the processes. The composing operators are derived from the respective 
splittings defined in (3.5) and (3.6), (3.9) and (3.14). 
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The basic convergence requirement on M is described in [30, Cor. 1.1 3], 

THEOREM 3.6. Let Assumptions 2.1, 3.1, and 3.2 hold. If the condition 

(3-15) p(M) < 1 

holds for the spectral radius p of M = I - K.~ 1 K in process (3.3), then the iteration converges, for every r and for 
arbitrary initial element x°, to a unique solution x* of (2. 16). 

In the following we often need the invertibility of the reduced Hessian, H. and of a modification, H t = H + 
C uu eV. The invertibility is guaranteed for small e under the usual second order sufficiency conditions for optimization 
problems because these require, with some constant c, 

(3-16) A(0, u ), (0, !*))(</>*, u*, \*)(6<j>,6u)(64>,6u) > c ||(<5<£,<5u)|| 2 

for all those (So. 6u) that satisfy h t;> do + h u Su = 0. The last condition allows to substitute So = 1 h u Su. thus 

leading to 


h u (Su,Su) — h* C^, u (6u,6u) — C u ^h^ l h u {6u,Su) + C uu (Su,6u) 
= H{6u,6u) ^(ll/i^HP + IIHI 2 ) >c||HI 2 - 


Therefore, H is invertible and so is 'H, for small e. 

Assumption 3.7. Let a real scalar l > 0 exist such that Ti t =H + C uu eV is invertible for 0 < e < e. 

In addition we will need the following lemma to prove the central statements in Theorems 3.9, 3. 1 0, and 3.11. 
LEMMA 3.8. Let Assumptions 2.1 and 3.7 hold. Then the inverse of the operator AC defined in (2.17), written as 


is given by its entries 


1 

( 

Kn 1 

Kf, 1 

Kft 

II 

1 


Kf 2 x 

k — 1 
/v "22 

ir -1 
^23 

1 

\ 

Kf 3 x 

u 1 

X 

jr 


'Ml — 

*Ta = 
fCvl = 

k — 1 — 

^22 ~ 

k — 1 — 

^23 _ 


h+'KH-'Kh-* , 
h^h u n~\ 

h* 1 - h^KH~ l Khl x 4 - , 

n~\ 

h u Tt 1 h*h (j) x C ( j,^,h i p l + x £^ u 7i^ l £ u ^h^ 1 

~^ij> h u h ( j ) C^h^ 1 h u H 1 L u{ ph ^ 1 . 


Here, H. explicitly given in (2.1 1), denotes the reduced Hessian of the constrained problem (2.1). For the operator 
AC 0 in (3.5) which differs from AC only in the central entry £ ut , the inverse is given by 


/Co 1 


Kit 

Kf2* 

Kf 3 x 


K — 1 

^12, e 

k — 1 

^22, e 
If -x 
^23, c 


Kit \ 

k — 1 
^23, e 

^33! c / 


where the entries differ from those ofK 1 insofar as H is replaced byH ( .=H + C uu eV = h£ h~ x h~ x h u - 
^ U ^<t>U ~ £u(f> hu + £uu(I + €p)‘ 
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3.3.1. Convergence Analysis of the Outer Loop. We now investigate the convergence properties of the outer 
loop. The convergence of x n = ( v n , s n , w n ) T can be characterized by the convergence of the sequence s n . This is 
important because in the application (see Section 4) a: is a vector which consists of functions defined also on the whole 
two-dimensional domain whereas s is only defined on the boundary. 

THEOREM 3.9. Under Assumptions 2.1, 3.2, 3.3 and 3.7 the following statements hold. 

1. ) The iteration (3.7) converges if and only if the iteration 

(3.17) s n+1 = K,^C uu eVs n + f 2 
converges, where r = K,Q l r. 

2. ) If the spectral radius of I — i'H ■ C uu tP ) -1 H, defined on the boundary, satisfies 

(3.18) p(I-(H + C uu eV)- 1 H)< 1, 

then the iterates of (3.7) converge to the solution x* o/(2.16). 

Proof. Denoting /Cq 1 by its entries, ( i,j = 1,2, 3), we see that Mo = / — /C 0 1 K. is given by 

/ 0 Mi 0 \ 

(3.19) Mo= 0 M 2 0 

V 0 M 3 0 / 

where 

Mi = ic; 2 ] e c uu eV 

for i = 1, 2, 3. By the iteration (3.7) we have 

*o +1 = Ko l r +(/- Kq 1 1Zo)xq = k,~ 0 x r + M 0 x n 0 
or by (3.8) with Xq = (u”, s n ,w n ) T 

u n+1 = Mi s n + fi ? 

(3.20) s n+1 = M 2 s n + r 2 , 

K; n + 1 = M 3 S n + f 3 , 

where r = = (n,r 2 ,r 3 ) T . It is immediate that the convergence of the sequence Xq = (u n , 5 n , w n ) T is 

equivalent to the convergence of s n which proves the first statement of the theorem. 

The entry M 2 is given by, see Lemma 3.8, 

M 2 = K^ t C uu tV = Hf l C uu eV = I — (H + C uu eV)~ l H . 

By Theorem 3.6 the condition p{Mf) < 1 is sufficient for convergence of the sequence s n and by part 1 .) also for the 
convergence of Xq. □ 

From the result (3.18) it can be seen that we can set the convergence rate of the outer loop by choosing eV 
appropriately. However, the choice of eV influences the convergence of the inner loop as well. 

3.3.2. Convergence Analysis of the Inner Loop. The convergence properties of the inner loop are described in 
the following theorem. 

THEOREM 3.10. Under Assumptions 2.1, 3.2 and 3.3 the following statements hold. 
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1.) The iteration (3.10) converges if and only if the iteration 


(3.21) 


4 +1 = (I + - I)x k 2 + fa 


converges , where r = Kj l ro- 

2.) If the spectral radius of the boundary operator (I + eV)~ l (C~^7i - I) satisfies 
< 3 - 22 ) p((/ + CP )- 1 (£-„>«-/))< 1 , 


then the iterates of (3. 10) converge to the solution x). 

Proof. The inverse of the operator /C/ defined in (3.9) is given by 


(3.23) 


*7 


-1 




~r~ l h x h~ x 

*~u,t n u n 4> 


-h+'KCzl 


u,e 

0 


h * 

0 

0 


Thus, the operator M / = / — JC 7 K, in (3.3) is for the inner loop explicitly given by 


(3.24) 

Mi = 

( C ^ C u4> ) 

* h~ x r 

*~U,t tL U U (f> 4>U 

0 ^ 
0 



\ h ^ 

£>4>u 

0 / 


Denoting the following composed operators by <7, Q, N , C, £/, 



G 

— h<f> ^uj 


Q 


(3.25) 

N 

r- 1 LX L — x p 

n u n <f> ^(pui 


C 

= H 

X ^<p(pt 


u 

= K 

X p 
*~<pU 5 

Mi can be written in the form 



( GQ GN 0 

(3.26) 


Mi = 

Q N 0 




\ C U 0 


Hence the inner iteration with x ) = (x ) ^ , x J 2 , x J 3 ) ' T can be written as 

= GQx k IA + GNx k Il +f u 

( 3 - 27 ) ar^ 1 = Qx k IX + N x k 2 + f 2 , 

Multiply (3.27) by G and substitute the resulting equality into (3.27). This yields 
(3-28) x *y =Gx k+i +fi _ Gf2 

From this equation it is clear that if x) 2 converges, then so does x k x and by (3.27) also x k 3 . 

To show the second statement, use (3.28) in the form x) x = Gx k I2 + f x - Gf 2 to eliminate x k IX in (3.27) which 
gives 


*/,2 — (Q G + N) x k 2 + T2 + Qr\ — QGr 2 - 
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By definition (3.25), the operator Q G + TV coincides with (I + eP) 1 (C U ^TC — I) (see (2.1 1)), because 
QG + N = fcj h~* h~ l h u - C~ \ C u4> h~'h u - C u \ h* h ~ * 

= ^u,e — ^»«) 

= (J + eP )- 1 (£-»«-/)• 

Therefore, if p((/ + eP) -1 (£“J ?< — /)) = p(Q G + AT) < 1, then by Theorem 3.6 the iterates x) 2 converge and by 
the first assertion also the sequence x k j. □ 

3,3,3. Convergence Analysis in Case of One Inner Iteration. In case only one inner iteration is performed, 
convergence is determined as follows. 

THEOREM 3.11. Under Assumptions 2.1, 3.2 and 3.3 the following statements hold. 

1.) The iteration (3.13) converges if and only if the iteration 

s n+1 = - I) s n 


converges. 

2.) If the spectral radius of the boundary operator C u \7i — I satisfies 
(3.29) p(C~yH - I) < 1, 


then the iterates in (3.13) converge to the solution x* of (2.16). 

Proof The inverse of the operator /C = /C/ is given in (3.23). Thus, in this case where only one inner iteration is 
performed, M — I — KL~ l JC is given by 


M = 


( K x h u c-HKh~* C m 


-c u UKh^c 


<f> ^<p(p £"u. 4>) 

<f> C(f>4) ~ Cut*) 

-x p 
l <f> 


K l h u C^{Kh-^C^^C uu eV) 0 \ 

-£u\(KK*c+u + CuuiV) 0 

^ <j> 0 / 


We denote the composed operators by G, Q, C, U as before in (3.25) and let N\ be defined by 


Ni = C^ + C^eV). 

Then Ai can be written in the form (3.26) with TV replaced by TVi. The proof follows the same lines as for the one of 
the previous theorem. 

By definition of G, Q, C, U in (3.25) and of TV : above, the operator Q G + N\ equals 

h u h 4>* h u - £u<t> h~ l h u - C~\ h* h~* Cm - C~\ C uu CP 

= QG+N - Cz\C uu tV 
= C~\(H - C uu ) - £-!£„„ tV 
= (I + CP)~ l C~pi — I 
= Cz\H - 1. 

□ 

3.3.4. Discussion of V . The preceding convergence analysis shows that if V is not present, or, equivalently, if 
t = 0, there is no outer loop in the nested defect correction process. In that case, the convergence requirements (3.29) 
and (3.22) coincide and are given by 

p(czln - 1) < i . 
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In general, £ uu will not model the Hessian, 77, very well, and convergence is not necessarily ascertained. 

In order to solve the system (2.14) efficiently with the defect correction approach, we modify (2.14) through 
replacing C uu by C UiC = £ uu (7 + eP). We have seen in the preceding Sections 3.3.1, 3.3.2, and 3.3.3, that the 
convergence rate of the processes can be determined by appropriate choices of P. However, finding tP that performs 
well both in the outer and inner loop requires solving conflicting tasks. Theorem 3.9 suggests that C uu eP should 
be small in the sense that 77 + £„ u eV is only a small perturbation to 77. Theorem 3.10 indicates that £ u , e should 
approximate TL - C uu fairly well, because (7 + tV)~ l (£ u j77 - I) = £“> (77 - C uu ) is the convergence-governing 
part. The problem to be solved is 

min max [p (7 — (77 + £„ u eP) _1 77) ,p (7 + eP)” 1 (£-^77 -7)} . 

If only one inner iteration is done, £ Ui< is required to approximate 77 as described in (3.29). In both situations, i.e., 
in the nested defect correction and in the case with only one inner iteration, some knowledge of the operators involved 
is necessary for an appropriate choice of V. However, the defect correction process with one inner iteration is easier 
to apply than the nested defect correction process (because it allows for a closed representation), and thus preferable 
whenever applicable. If there is not much information available on the Hessian, choosing a “small” eP and applying 
nested defect correction seems to be more promising. We will discuss the choice of P for our example problem in 
Sections 4.5 and 4.7. 

4. Application: The Small Perturbation Potential Problem* We consider an optimal control problem governed 
by a partial differential equation. The example problem is motivated by problems of aerodynamic optimization. In the 
following, we derive the example problem and state the equations relevant to the approach delineated in Sections 2 
and 3. Subsequently we turn to the discretization and the finite— dimensional solution of the optimization problem. We 
include a convergence estimate for the example using local mode analysis. 

4.1. The Small Perturbation Potential Equation. We start with a short derivation of the state equation that 
governs our optimal control problem. We assume inviscid irrotational flow modeled by the continuity equation, 

VW = o, 

where p is the density of the fluid and v the velocity field. The potential function, is defined with the relation 

v — V4>. 


The density is related to the potential by the isentropic density law (see, e.g., [13]). We consider a slender body aligned 
with the x-axis, and define a perturbation potential <j> by 

^ = Uqc(X + (f))y 

where U ^ is the free-stream velocity. Under the assumption of a dominating ^-component of the velocity field, 
and neglecting terms that are proportional to <f% and to the following first-order transonic small -perturbation is 
obtained (Prandtl-Glauert equation): 

(1 _ -^ 00 )^ 12 : + (j>yy = 0 . 

If y = u ( x ) is toe equation of the surface of the slender body, it is possible to set the surface boundary condition on the 
x-axis, that is at y = 0. The presence of the slender body will appear in the computation only through the boundary 
condition 

( 4 -I) V = (Uvc -f u)u x « C/ooUx. 
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In terms of the perturbation potential, (4.1) is given by the normal derivative 


0n — tt x . 


The boundary conditions at the far-held for the perturbation potential are set such that it does not affect the far-held 
velocity. 

The small perturbation potential equation has been for a long time the basis for potential flow theories as it is a 
simplified form valid for flow helds along slender bodies aligned with the x-axis. We turn to form a simple optimal 
control problem based on that model. 


4.2. The Optimal Control Problem. The small-perturbation potential problem allows us to study a shape op- 
timization problem with a boundary control model defined on a fixed domain, thus avoiding the complication of a 
changing geometry. 

We consider the following minimization problem, 


(4.2) 


mm 

{4>,u) 


J u2ds + r i 2 J <t>xUds, 


subject to the following state equations. 


(4.3) 


(1 - XX + 4>yy 


4 ^Tl 

4 > n 


4> 


0 in = (0, l) 2 , 

u x on T = {(x, 0) : 0 < x < 1}, 

0 on r* U T r , 

o on r t , 


Here, the parts of the boundary T/, T r , and T t are given by F; = {(0, y) : o < y < 1}, r r = {(1,2/) : 0 < y < 1}, 
and r 4 = {(x, 1) : 0 < x < 1}. 

We now give a short explanation of the different terms in the cost functional (4.2). The first term is proportional 
to a pressure matching term since in the small disturbance model the pressure of the flow on the slender body is 
proportional to the derivative of the potential in the flow direction, 0 X . The desired “pressure distribution”, 0^, is 
given. The second term is a penalty on the control and rji is a parameter. The third term is artificially introduced 
to the objective function to better model the structure of problems in applications since in aerodynamic optimization 
problems often non-zero terms £ U( p and £^ u are present. 

43. Existence and Uniqueness of an Optimal Control. Let us assume that the design, u(x ), belongs to the 
subspace of functions that can be spanned by the basis {sin(27rfcx)}JJ =1 , i.e., 

n 

(4.4) u(x) = Uk sin (2n kx), 

k—l 

where {u*}jj =1 are real numbers and n is a positive integer. For that choice of u(x) there exists a solution 0(x, y) of 
the minimization problem (4.2) subject to the state equation (4.3) of the form 


(4.5) 


n 

y) = “fcV’fc (y) cos (2 TTkx ) , 

k—l 


where then functions ipkiy) are given by 


(4.6) 


4>k{y) = — (— tanh(27r7/c) cosh(2n^ky) + sinh(27T7A:y)) 
7 
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for 7 = v/l - M^. By inspection the above <f> satisfies the state equation (4.3). 


Remark 4.1. By the solution (4.5) the operator h# has the symbol (2nk)/ip k (y). (For the definition of the 
symbol of a differential operator see, e.g., [21], p.38.) Since the Fourier transformation is a homeomorphism we 
conclude that the operator h,., satisfies Assumption 2.1. 

Theorem 4.2. There exists a unique solution u*(x) to the optimal control problem (4.2) subject to the state 
equation (4.3). 

Proof. Let us denote the vector of design coefficients by u N : u T N = (tij, •■•,«„). A direct substitution of the 
solution (4.5) into the cost functional (4.2) results in a leading quadratic term of the form uJjQun with Q being a 
positive definite matrix. This proves that the minimization problem has a unique solution. 0 

4.4. Optimality Conditions and the Newton Step. The first-order optimality conditions of the problem ( 2 . 1 ), 
as outlined in Section 2.2, are the state, costate, and design equations (2.5), (2.6) (2.7). The state equation is given 
in (4.3). The adjoint equation for this problem takes the form 


(1 Af OQ )\ XX ~h Xyy 0 

^ 4 ' 7 ) A n = ~ 4>i x ) + T)2 U x 

A n = 0 

From the design equation (2.7) we get the gradient 


in ft, 
on r, 

on <9ft — T. 


(4.8) 


9 = -A x -f- 7?i u + r)2 <j>x on T. 


The Newton step satisfies (2.14) with the operators 


and 

^ d \ J ’ £’ uu Vi ‘ -Hr, — £ifm — ^ 

0 |n 

V2dx\r 

k _ _ f (1 + dyyln Ik IX / 

Explicitly, the Newton step, s, satisfies the following system of PDEs: 

0 |n 

~^x|r 

(4.9) 

(1 ^oo) W XX 4" Wyy — 0 in ft, 

Wn - Vxx + T) 2 S x = 0 on r. 


(4.10) 

Vi s + r /2 v x — w x = —g (x, 0 ) on T, 

^( 1 ) = <?( 1 ). 


(4.11) 

(1 ~~ Mle) v xx + Vyy ~ 0 (rc, ?/) in ft, 

v n — s x = 0 (x, 0 ) on T, 

s( 0) = 0. 



The defect correction process described in Section 3 will now be applied to the example problem introduced in 
Section 4.2. Convergence of the solution process (3.4) is governed by the singular values of the operators Mo* Mi, 
and M derived in Sections 3.3.1, 3.3.2, and 3.3.3. The operators for the small perturbation potential problem are given 
in Section 4.4 so that Mo, Mi, and M are easily found for the example. 

Knowledge of the operators allows to choose V such that small convergence rates are obtained. We now use local 
mode analysis of the convergence-governing operator to choose the operator V. Similar analysis has been introduced 
in the past to approximate the reduced Hessian of optimization problems governed by PDEs (see, e.g., [ 1 ], [2], [3]). 
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4.5, Choice of V by local mode analysis of the PDEs. The local mode analysis is performed locally around a 
point on the boundary T, ignoring the boundary conditions on d£l - I\ Thus for a boundary value problem, as we 
have, it is only an approximation. We deliberately do not insist on the exact analysis since it can not be done in general 
applications while the given analysis can be applied (after linearization and freezing of coefficients when the problem 
is nonlinear). 

We choose to use the splitting of Section 3.2.3, i.e., the case of one inner iteration. We have seen in that section 
that convergence depends on the eigenvalues of the operator T = C~ — I . We study one Fourier component of the 
error, 

e(x,y) = e(u 1> w 2 )e i ^ x+u ^. 

The interior equation in (4.3) relates u>\ and u >2 by 

(4.12) (1 - = 0. 

We choose the decaying mode solution for Equation (4.12), 

(4.13) (* = <v^TM£| Wl |. 


The boundary equation implies that 


y/l - \uj\ | ^ = ioJ i u . 

We arrive at the Fourier symbols of the operators in the convergence-governing operator T : 



K 


II 

-K 

= , 

II 

-e 

^11 

= 

£q !>u 

£u<p 

= “7/2^1 , 

r — 

*~uu — 

Vi • 



(4.14) 


These imply that the Fourier symbol of the Hessian is given by 

— h x h ( p x £($>$ h u h u £u<t> h u + £ uu — — 2 7 / 2 -j r f/i • 


By (3.29), the choice of the operator V is such that V « £ U ^H. We obtain the approximated symbol of the desired 
operator V as ' 


(4.15) 


T)i \ 


■ 27/2 


\ui\) 


Since the second term in (4. 15) does not correspond to a differential operator, 
first term, our first approximation of V is given by 


and because it is of lower order than the 


(4.16) V=- — D xx . 

m 

We now turn to use the same local mode analysis to approximate the asymptotic convergence rate of the defect 
correction process for low mesh size h. The above symbols (4. 1 4) imply that the symbol of the convergence-governing 
operator T is, where cV is taken to be , 


f = 4.1 w - 1 = (i + «p) _1 42 n-i. 
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Using Parseval identity we estimate an upper bound for the convergence rate of the iterates of the defect correction 
process by 

( 4 -17) /x<max (t*(w)f (w)V , 

where uj = ^ for k ranging from 1 , . . . , n. Here, n is the number of grid points on the boundary T, and T* is the 
complex conjugate of T. 

4.6. The Discretization. We define a uniform grid on the domain fi, containing m grid points. The perturbation 
potential, <f>, is defined on the grid vertices, and the control is defined on the mid-interval points on the boundary T, 
i.e., <f> e JR m and u 6 !R n , with n = y/m — 1. We then apply a second order finite difference discretization to the 
problem (4.2), (4.3). The stencils can be found in (4.24) and (4.26). The resulting finite dimensional problem is to 
minimize a quadratic functional under linear constraints, 

< 4 - 18 ) min F(<j) N , u N ) s.t. A(p N + Bu N = b, 

(<f>N s UjV) 

where the discretized objective function can be written as 

( 4 * 19 ) F = 2 + y uJjH uu u N -j- 772 + <£^c + u%d. 

The discrete Lagrangian is given by 

(4.20) it/v, A/v) = F + A^(A<^at 4- Bun — b). 

Note that for this (quadratic) problem the second partial derivatives L^ u and L uu of the Lagrangian coincide 
with H^ u and H uu , respectively. 

Applying the first order optimality conditions (Karush-Kuhn-Tucker conditions, see, e.g., [20]), we have to solve 
the system 

Tf2 

? }l L uu 
A B 

where 

<t>N g JR m , u N e JR n , X N e M m , c e JR m , deIR n , be M m , 

e iR mxm , L un 6 JR nXn , L^ u € JR mXn , A e JR mXm , B e ]R mXn . 

If A is invertible, the discrete state equation A0 N + Swat = 6 can be solved for 0 n (uat) = A “ a (6 - Buyv). 
Thus, we can define the discrete unconstrained problem 

( 4 2 1 > min J(u N ) = F(<f> N (u N ),u N ). 

In order to get the gradient, gw , of the unconstrained problem (4.21), the state and the costate equations must be solved 
exactly. This means that for a given control un the following set of equations has to be satisfied, 

l A T \ / 4>n \ _( -c-L^un \ 

\ A 0 / \ Ajy j y b — Bun ) 

Then the gradient can be computed as 

9n = B T \ N + t ) i L uu un 4- L^ u <f>N 4- d. 
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The Hessian of the discrete unconstrained problem (4.21) is given by 

(4.22) H=B t A~ t B-L u<j> A~ l B-B T A~ T L^ u + L uu . 

The Newton step, sn, of the discretized unconstrained problem (4.21) can be computed by solving the system 

( L ^ L^ u A t \ / vn \ / Q \ 

(4.23) L U(t> L uu B t I sn I = ~9n 

\ A B 0 ) \w N ) V 0 / 

4.7. Choice of V by local mode analysis of the Finite Difference Equations. We now perform the local mode 
analysis, similar to the local mode analysis in Section 4.5, taking into account the specific discretization. By analyzing 
the finite difference equations we hope to get a better approximation of the reduced Hessian and as a result a better ap- 
proximation of the operator V. Note that although the operators analysed in this section are finite difference operators 
and not differential operators, we still denote them as before to avoid excessive notation. 

The discrete interior equation has the form 

(4.24) -f- i) + ij + <f>i- ij) — 0 

where 

c={l-Ml 0 )b, a = {l-M 2 00 ){-2b)-2b. 

We study one Fourier component of the error, 

e(x,y) = e(0i,02)e' (01 ^ +e2 £\ 
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The other operators in the convergence-governing operator, L~ \H - I, have the following symbols: 


(4.29) 


hu — h£ — s ’ n (^) = \/2(l - cos 0i ) , 

= ~D XX — ^(1 - cos^i) , 

^ ucp — 7/2 i 
£"uu — 7/\ . 


The reduced Hessian (2.11) contains the operator h ^ 1 h u and its adjoint. The expression h ^ 1 h u can be simplified to 

h^K = - 1 . 

y (1 ~ M Zo) ~ K 1 - ~ cos 6i ) 

This leads to our second choice of the operator P, 

< 4 - 30 > V = ~m ( (1 - m ~ )/ + t ( 1_m - )2jDi1 ) 1jD -- 

Having chosen the operator V, the asymptotic convergence estimates follows as in Section 4.5. 

4.8. The Defect Correction Process. On the discrete level, the linear system (4.23), which we denote by 
(4-31) K xn = r/v , 


has to be solved in order to compute the Newton step, s^. The defect correction process is described by the iteration 
( 4 - 32 ) K x n = r N ~ Rx c n , 

where K and R define a splitting of K, i.e., K - R — K. Convergence of the solution process (4.32) is governed 
by the singular values of the operators Mo , Mi, and M derived in Sections 3.3.1, 3.3.2, and 3.3.3. Their discrete 
counterparts M 0 , M/, and M are analyzed with local mode analysis in Section 4.7. Convergence of the outer loop 
depends on the spectrum p{Mo ) = p(I — (H + L uu 1 1’) 1 II) . Convergence in the inner loop is governed by 
p(M ] ) = p{(I + eP) 1 (L U *H — /)) . If only one inner iteration is done, we investigate p(M) = p(L~\H — I) . 
One specific choice of V was given in Section 4.5, motivated by the local mode analysis of the differential operators. 
The corresponding matrix is 


(4.33) 


Pi = - 


1 -ML 


D 


xx,N * 


Analyzing further the finite difference equations, the local mode analysis in Section 4.7 motivates secondly to use the 
matrix 


< 434 > P * = - (^ (1 - Ml)I + £(1 - MlfD xx ^ 1 D XXtN . 

The main computational work in the defect correction process with the splittings defined in Sections 3.3.1, 3.3.2, 
and 3.3.3 is the solve with h $ and its adjoint. This is the solution of the linearized state equation and of the adjoint 
equation, which amounts to the solution of two linear PDEs in our example. On the discrete level, h# is represented 
by A y and the main computational work in each iteration is the solve with A and A T . 
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4.9. A Preconditioned Krylov Subspace Method. Linear systems like the above (4.31) can be solved with 
Krylov subspace methods, e.g., the well-known Krylov subspace method for general matrices GMRES, see [23]. 
However, with ill-conditioned problems, as the one given in Equations (4.2) and (4.3), the number of steps these 
methods require can be as high as the dimension of the linear system, if they do not fail altogether. A high number 
of steps usually presents an unacceptable computational effort. However, Krylov subspace methods can be very fast 
and efficient for well-conditioned systems, cf. [25], [29]. Under certain assumptions, see [6], superlinear convergence 
can be proven for GMRES. In the following we will see that the results furnished by the local mode analysis for the 
modified system defect correction can enhance the performance of preconditioned GMRES iterations. 

The preconditioner we use is closely related to the splittings we propose for the defect correction; it is in fact 
identical to the splitting matrix K introduced in Section 3.2.3. Thus, the linear system (4.31) is replaced by the 
preconditioned system 

(4.35) K~ l Kx n = K~ l rjv . 

In each iteration of the preconditioned GMRES, the matrix-vector-product K 1 K z = z + must be computed. This 
can be done successively by solving three linear subsystems, in a similar way as described in Section 3.2.4. In this 
respect, the work required in one GMRES iteration is roughly the same as in one defect correction iteration, i.e., one 
solve with A and one solve with A T (see [4]). However, the implementation of GMRES is more difficult than that of 
the defect correction process. For example, re-orthogonalization, which can be very costly as well, is often necessary. 
In addition, storage requirements increase as the iteration progresses, thus rendering the method unattractive of very 
large problems. For these issues see, e.g., [23], [14]. 

The eigenvalues of the preconditioned system matrix K~ l K are bounded below in absolute value by 1. The 
number of eigenvalues distinct from 1 are at most n, where n is the number of design variables. For these results see [4]. 
Since the performance of GMRES, similar to that of other Krylov subspace methods, depends on the eigenvalue 
distribution of the underlying system matrix (see, e.g., [6]), the theory indicates that GMRES will take not more than 
n + 1 steps. The numerical results are described in the following Section 4.10. 

4.10. Numerical Results. In the numerical tests, we did not restrict the design variable, u(x), to the subspace of 
sin functions, thus extending the computations beyond the theoretical treatment in Section 4.3. 

We show results of the defect correction process for the case of one inner iteration as described in Section 3.2.3. 
The system is modified with Pi and P 2 defined in (4.33) and (4.34), respectively. For the parameter M^, the values 
0.0, 0.1, 0.5, 0.9 are used in the computations. We consider the combination of cost function parameters rji = 1.0, 
T )2 = 0.0, and 771 = 1 . 0,772 = —1.0. For the same combinations of parameters, GMRES is tested on the preconditioned 
system K~ l K. We do not give the convergence history for unpreconditioned GMRES, but state that the number of 
iterations required for convergence is almost equal to the dimension 2m + n of the system in all considered cases. 

In Figures 5.1 and 5.2, typical convergence behavior of the defect correction process and preconditioned GMRES 
can be seen. The figures depict the main results, the small and mesh-independent convergence rates of the defect 
correction process that are exhibited right from the beginning of the iterations, and the improvements in GMRES 
achieved with the suggested preconditioning. 

In the tables, results are shown for systems of dimension 54, which corresponds to a 4 x 4 grid, up to dimension 
8514, which is a grid of 64 x 64 points. We always chose the discretization in y-direction equal to the discretization 
in ^-direction and refer to this number by the dimension n of the design space. The number m of state variables is 
given as m = (n -f l) 2 . 

Stopping criterion for all iterations is a threshold 10“ 5 for the ^-norm of the residual, i.e., we stop if the following 
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requirement is met, 


\\Kx n - r N ||, 2 


1 

2m + n 


2m+n 

T. (Kx N - r N )? < 10" 

i+l 


Performance of the defect correction process is shown in Tables 5.3 and 5.4. In the considered cases we only allow 
for one inner iteration. For the choice of cost function parameters rji = 1.0, % = 0.0 in Table 5.3, the terms L+u, L U(t> 
in K vanish. This does not only simplify the convergence analysis, but also often admits a faster numerical solution 
than the second choice of nonzero This is easily seen by a comparison with Table 5.4. The dimension of the 
design space, n, and of the entire system are given together with the numerical results for Pi and P 2 defined in (4.33) 
and (4.34), respectively. For both choices of P, the number of steps until solution and the CPU required by the iterative 
process are given. The convergence rate, the ratio of successive errors, is the asymptotic convergence rate valid at the 
end of the defect correction iterations. This rate is approximately a constant throughout the defect correction process 
for given parameters and grid. The largest eigenvalue of the matrix M defined by the splitting is a close upper bound 
for the convergence rate. The computations are done for four different parameters M x , M = 0.0, 0.1, 0.5, 0.9. 
Although the original system becomes increasingly ill-conditioned as Afyo approaches 1, the modified system defect 
correction still performs well for = 0.9. The convergence of this defect correction is mesh-independent. The 
asymptotic convergence rates, in the limit of mesh-size going to zero, furnished by the local mode analysis for each 
specific combination of parameters is given in Table 5.5. The discrepancy between the results of the local mode 
analysis and the actual convergence rates of the defect correction process is due to the fact that the considered domain 
is finite, which is not taken into account by the local mode analysis performed here. 

Performance of the preconditioned GMRES is shown in Tables 5.2 and 5.1. Again, the dimensions of the design 
space and of the entire system are given together with the number of steps until solution and the required CPU. It 
can be seen that the required CPU times for iterations of preconditioned GMRES and of the defect correction process 
are of the same order of magnitude. Since the convergence rate of GMRES is in general not constant throughout the 
iteration, it is not considered in the tables. Qualitatively, the convergence rate is depicted in Figure 5.1. 

In our computations with GMRES we have not only considered Pi and P 2 given in (4.33) and (4.34) , but also 
P = 0, i.e., the preconditioner K with the entry L u o = L uu . It can be seen that the number of steps with this 
preconditioner roughly equals n/2 (Table 5.2, = 0, L u<s> = 0) or n (Table 5.1, L^ u / 0, L u<p £ 0), respectively. 

Investing the computational effort of introducing Pi or P 2 as suggested by the local mode analysis pays out in a low 
and mesh-independent number of iterations. 

All computations were done with Matlab on a SUN 2 X UltraSPARC-II with 2Gb RAM. 


5. Discussion and Concluding Remarks. We propose a modified defect correction process to solve efficiently 
the (KKT-)system of equations composed of the necessary optimality conditions for optimization problems governed 
by state equations. The new method is simple to apply and embed into existing codes. It requires to solve successively 
the linearized state and the costate (adjoint) equations with different right hand sides in each iteration. These linear 
equations are obtained by first modifying the KKT system, JC, with the introduction of a “preconditioning” operator 
P, and then splitting the system into two parts, K. = K.('P) + TZi'P ). The solution is obtained by a defect correction 
process that requires one solve with the operator /C(P) in each iteration. Convergence theory is provided in the 
paper for the different splittings that we propose. We also suggest to use the operator K.(V) as a preconditioner 
for GMRES. The introduction of the operator V is crucial for fast convergence. We advocate to use the structure 
induced by the governing PDE. This can be done by a local mode analysis of either the PDE, or of its discretized 
equations. We test both the defect correction process and the preconditioned GMRES on a model problem that mimics 
an aerodynamic shape optimization problem. We obtain for both methods fast and mesh-independent convergence. 
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The numerical results are consistent both with the convergence theory in the paper and with the approximated local 
mode analysis for the asymptotic convergence rate. An extension of the approach to optimization problems with 
inequality constraints will be treated elsewhere. Another application might be the area of multidisciplinary design 
and optimization problems. When considering a large system of equations, obtained for example in those problems, 
a splitting of the KKT system can be applied twice, once to decouple the large multidisciplinary KKT system into 
subsystems [1], and a second time to solve efficiently each of the subsystems. Such a method might require the 
introduction of an operator V for the different subsystems as well as for the large system. 
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Fig. 5 .1. Residual development of preconditioned GMRES for grid sizes n = 32 (upper figure) and n = 64 (lower figure). (Problem 
parameters: r?i = 1.0, rfc = 0.0, Moo = 0.5) 



Iterations 


FIG. 5.2. Residual development of defect correction process for grid sizes n ~ 32 (upper figure ) and n = 64 (lower figure). ( Problem 
parameters: tji = 1.0, 772 = 0.0, Moo =0.5) 
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Table 5.1 

Performance of preconditioned GMRES for cost function parameters r)\ — 1 . 0 , 772 = — 1 . 0 . 


(Moo = 0.0) 



p b 

Pi 

P2 

n 

dim 

#it 

CPU in 5 

#it 

CPU in 5 

#it 

CPU in 5 

4 

54 

6 

2.00e-01 

6 

9.00e-02 

6 

9.00e-02 

8 

170 

10 

1.80e-01 

6 

9.00e-02 

7 

1.00e-01 

16 

594 

18 

8.20e-01 

6 

4.10e-01 

7 

4.40e-0 1 

32 

2210 

32 

6.60e+00 

7 

3.10e+00 

6 

3.12e+00 

64 

8514 

57 | 

8.26e+01 

7 

3.29e+01 

6 

3.17e+01 


(Moo = 0.1) 



p b 

Pi 

P2 

n 

dim 

#it 

CPU in 5 

#it 

CPU in 5 

#it 

CPU in 5 

4 

54 

6 

8.00e-02 

6 

4.00e-02 

6 

4.00e-02 

8 

170 

10 

1.70e-01 

6 

9.00e-02 

7 

1.00e-01 

16 

594 

18 

7.50e-01 

6 

3.80e-01 

7 

4.00e-01 

32 

2210 

32 

6.51e+00 

7 

2.64e+00 

6 

2.72e+00 

64 

8514 

57 

8.36e+01 

7 

2.46e+01 

6 

2.48e+01 


(Moo = 0.5) 



p b 

Pi 

p 2 

n 

dim 

#it 

CPU in 5 

#it 

CPU in 5 

#it 

CPU in 5 

4 

54 

6 

8.00e-02 

6 

5.00e-02 

6 

5.00e-02 

8 

170 

10 

1.70e-01 

6 

8.00e-02 

6 

l.OOe-Ol 

16 

594 

18 

7.50e-01 

6 

3.70e-01 

6 

4.00e-01 

32 

2210 

33 

6.71e+00 

6 

2.63e+00 

6 

2.74e+00 

64 

8514 

59 

8.68e+01 

6 

2.38e+01 

6 

2.52e+01 


(Moo = 0.9) 



p b 

Pi 

P2 

n 

dim 

#it 

CPU in s 

#it 

CPU in s 

#it 

CPU in s 

4 

54 

6 

9,00e-02 

5 

5.00e-02 

5 

5.00e-02 

8 

170 

10 

1.60e-01 

6 

8.00e-02 

6 

9.00e-02 

16 

594 

18 

7.20e-01 

6 

3.20e-01 

6 

3.70e-01 

32 

2210 

34 

6.75e+00 

6 

2.48e+00 

6 

2.54e+00 

64 

8514 

66 

9.37e+01 

6 

2.32e+01 

6 

2.43e+01 
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Table 5.2 

Performance of preconditioned GMRES for cost function parameters t}\ = 1.0, r } 2 — 0.0. 


(Moo = 0.0) 



^ b 

Pi 

p 2 

n 

dim 

#it 

CPU in s 

#it 

CPU in 5 

#it 

CPU in s 

4 

54 

4 

8.00e-02 

4 

4.00e-02 

4 

3.00e-02 

8 

170 

6 

1.30e-01 

6 

8.00e-02 

4 

6.00e-02 

16 

594 

10 

5.30e-01 

! 

6 

3.40e-01 

4 

3.00e-01 

32 

2210 

18 

4.19e+00 

5 

2.55e+00 

4 

2.18e+00 

64 

8514 

33 

5.30e+01 

4 

2.34e+01 

4 

2.15e+01 


(Moo = o.l) 



^ b 

Pi 

p 2 

n 

dim 

#it 

CPU in s 

#it 

CPU in s 

#it 

CPU in s 

4 

54 

4 

8.00e-02 

4 

4.00e-02 

4 

3.00e-02 

8 

170 

6 

1.30e-01 

6 

8.00e-02 

4 

6.00e-02 

16 

594 

10 

5.30e-03 

6 

3.50e-01 

4 

2.90e-01 

32 

' 2210 

18 

4.23e+00 

5 

2.49e+00 

4 

2.20e+00 

64 

8514 

33 

5.31e+01 

4 

2.35e+01 

4 

2.14e+01 


(Moo = 0.5) 



p b 

Pi 

P 2 

n 

dim 

#it 

CPU in 5 

#it 

CPU in 5 

#it 

CPU in 5 

4 

54 

4 

8.00e-02 

4 

4.00e-02 

4 

4.00e-02 

8 

170 

6 

1.30e-01 

6 

8.00e-02 

4 

7.00e-02 

16 

594 

10 

5.20e-01 

6 

3.60e-01 

4 

2.90e-01 

32 

2210 

18 

4.21e+00 

5 

2.48e+00 

4 

2.15e+00 

64 

8514 

34 

5.35e+01 

4 

2.36e+01 

4 

2.15e+01 


(Moo = 0.9) 



p b 

Pi 

p 2 

n 

dim 

#it 

CPU in 5 

#it 

CPU in 5 

#it 

CPU in s 

4 

54 

4 

7.00e-02 

4 

3.00e-02 

4 

4.00e-02 

8 

170 

6 

1.30e-01 

5 

8.00e-02 

5 

7.00e-02 

16 

594 

10 

5.20e*01 

5 

3.20e-01 

4 

3.10e-01 

32 

2210 

18 

4.20e+00 

5 

2.37e+00 

4 

2.28e+00 

64 

8514 

34 

5.53e+01 

5 

2.29e+01 

4 

2.25e+01 
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Table 5.3 

Performance of DCP for cost Junction parameters tji = 1.0, 772 = 0.0. 


(Moo = 0.0) 



1 1 

1 ^ 

n 

dim 

#it 

CPU in 5 

conv.rate 

#it 

CPU in 5 

conv.rate 

4 

54 

12 

1.70e-01 

4.4219e-01 

4 

5.00e-02 

8.9447e-03 

8 

170 

12 

2.70e-01 

4.8270e-01 

4 

8.00e-02 

7.2777e-03 

16 

594 

12 

1.03e+00 

4.8478e-01 

4 

3.10e-01 

6.885 8e-03 

32 

2210 

11 

6.98e+00 

4.8348e-01 

4 

2.69e+00 

6.7893e-03 

64 

8514 

10 

6.89e+01 

4.81 18e-01 

3 

2.76e+01 

6.765 3e-03 


(Moo - 0.1) 



p 1 

p 2 

n 

dim 

#it 

CPU in s 

conv.rate 

#it 

CPU in 5 

conv.rate 

4 

54 

12 

1.60e-01 

4.3988e-01 

4 

5.00e-02 

9.2148e-03 

8 

170 

12 

2.70e-01 

4.8033e-01 

4 

8.00e-02 

7.51 13e-03 

16 

594 

12 

1.02e+00 

4.8238e-01 

4 

3.20e-01 

7.1 105e-03 

32 

2210 

11 

6.92e+00 

4.8109e-01 

4 

2.69e+00 

7.01 18e-03 

64 

8514 

10 

6.99e+01 

4.7879e-01 

3 

2.79e+01 

6.9873e-03 


(Moo = 0.5) 



Pi 

P 2 

n 

dim 

#it 

CPU in 5 

conv.rate 

#it 

CPU in s 

conv.rate 

4 

54 

11 

1.50e-01 

3.7660e-01 

4 

5.00e-02 

1.9846e-02 

8 

170 

11 

2.70e-01 

4.1315e-01 

4 

8.00e-02 

1.6904e-02 

16 

594 

10 

9.20e-01 

4.1261e-01 

4 

3.10e-01 

1.6202e-02 

32 

2210 

10 

6.58e+00 

4.1312e-01 

4 

2.65e+00 

1 .6028e-02 

64 

8514 

9 

6.65e+01 

4.1044e-01 

4 

3.06e+01 

1.5985e-02 


(Moo = 0.9) 



Pi 

P 2 

n 

dim 

#it 

CPU in s 

_ __ 

conv.rate 

#it 

CPU in s 

conv.rate 

4 

54 

10 

1.40e-01 

2.453 le-01 

10 

9.00e-02 

2.4010e-01 

8 

170 

9 

2.20e-01 

2.1838e-01 

10 

1.50e-01 

2.2788e-01 

16 

594 

9 

8.50e-01 

2.1 129e-01 

9 

5.00e-01 

2.2484e-01 

32 

2210 

9 

6.16e+00 

2.0949e-01 

9 

3.91e+00 

2.2409e-01 

64 

8514 

9 

6.53e+01 

2.0904e-01 

9 

4.31e+01 

2.2390e-01 
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Table 5.4 

Performance of DCP for cost function parameters r}\ — 1.0 , 172 = —1.0. 


(Afoo = 0.0) 



Pi 

p 2 

n 

dim 

#it 

CPU in s 

co nv. rate 

#it 

CPU in 5 

conv.rate 

4 

54 

13 

2.90e-01 

3.8568e-01 

12 

1.60e-01 

3.5942e-01 

8 

170 

17 

3.70e-01 

4.8401 e-01 

17 

2.20e-01 

4.9946e-01 

16 

594 

20 

1.62e+00 

5.2462e-01 

20 

9.40e-01 

5.5028e-01 

32 

2210 

21 

1.17e+01 

5.3696e-01 

21 

7.19e+00 

5.6899e-01 

64 

8514 

21 

1.19e+02 

5.4020e-01 

21 

7.51e+01 

5.765 le-01 


(M^ = 0.1) 



Pi 

P 2 

n 

dim 

#it 

CPU in 8 

conv.rate 

#it 

CPU in s 

conv.rate 

4 

54 

13 

1.80e-01 

3.8366e-01 

12 

1.00e-01 

3.5812e-01 

8 

170 

17 

3.70e-01 

4.8193e-01 

17 

2.30e-01 

4.9748e-01 

16 

594 

20 

1.58e+00 

5.2247e-01 

20 

9.40e-01 

5.4815e-01 

32 

2210 

21 

1.13e+01 

5.3484e-01 

21 

6.94e+00 

5.668 le-01 

64 

8514 

21 

1.13e+02 

5.381 le-01 

21 

7.28e+01 

5.7432e-01 


(Moo = 0.5) 



Pi 

P 2 

n 

dim 

#it 

CPU in s 

conv.rate 

#it 

CPU in 5 

conv.rate 

4 

54 

12 

1.70e-01 

3.3273e-01 

11 

1.00e-01 

3.2583e-01 

8 

170 

15 

3.40e-01 

4.296 le-01 

15 

2.00e-01 

4.4778e-01 

16 

594 

17 

1.39e+00 

4.684 le-01 

17 

8.50e-01 

4.9393e-01 

32 

2210 

18 

9.90e+00 

4.8126e-01 

18 

6. 1 8e+00 

5.1 1 15e-01 

64 

8514 

19 

1 .04e+02 

4.8552e-01 

18 

6.56e+01 

5.1815e-01 


(Moo = 0.9) 



Pi 

P 2 

n 

dim 

#it 

CPU in 5 

conv.rate 

#it 

CPU in s 

conv.rate 

4 

54 

14 

1.90e-01 

3.6261e-01 

14 

1.10e-01 

3.7223e-01 

8 

170 

15 

3.40e-01 

4.1422e-01 

16 

2.20e-01 

4.2473e-01 

16 

594 

16 

1.30e+00 

4.3635e-01 

16 

7.90e-01 

4.467 le-01 

32 

2210 

16 

9.07e+00 

4.4537e-01 

16 

5.65e+00 

4.5569e-01 

64 

8514 

16 

9.34e+01 

4.4926e-01 

16 

6.00e+01 

4.5958e-01 
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Table 5.5 

LMA Prediction of asymptotic convergence rates for the DCP. 


(rji = 1.0 in all cases) 



Pi 

P2 

M r oo 

§! 

ii 

S' 

% = -1.0 

o 

o 

II 

£ 

3 : 

II 

1 

►— * 
b 

0.0 

0.5000 

0.5781 

0.0 

0.5781 

0.1 

0.4975 

0.5757 

0.0 

0.5757 

0.5 

0.4286 

0.5124 

0.0 

0.5124 

0.9 

0.1597 

0.2723 

0.0 

0.2723 
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